Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold
Sentiment analysis over Twitter offers organisations and individuals a fast and effective way to monitor the public's feelings towards them and their competitors. To assess the performance of sentiment analysis methods over Twitter, a small set of evaluation datasets has been released in the last few years. In this paper we present an overview of eight publicly available and manually annotated evaluation datasets for Twitter sentiment analysis. Based on this review, we show that a common limitation of most of these datasets, when assessing sentiment analysis at target (entity) level, is the lack of distinct sentiment annotations for the tweets and for the entities contained in them. For example, the tweet "I love iPhone, but I hate iPad" can be annotated with a mixed sentiment label, but the entity iPhone within this tweet should be annotated with a positive sentiment label. Aiming to overcome this limitation, and to complement current evaluation datasets, we present STS-Gold, a new evaluation dataset where tweets and targets (entities) are annotated individually and therefore may present different sentiment labels. This paper also provides a comparative study of the various datasets along several dimensions, including total number of tweets, vocabulary size and sparsity. We also investigate the pair-wise correlation among these dimensions as well as their correlations to the sentiment classification performance on different datasets.
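The distinction between tweet-level and entity-level labels can be illustrated with a small sketch. This is not the STS-Gold file format: the class name `AnnotatedTweet`, its fields, and the helper function are hypothetical, and the negative label for iPad is inferred from the wording "I hate iPad" rather than stated in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnnotatedTweet:
    """Hypothetical record holding one tweet-level and several entity-level labels."""
    text: str
    tweet_label: str                                   # sentiment of the tweet as a whole
    entity_labels: Dict[str, str] = field(default_factory=dict)  # entity -> sentiment


# The example tweet from the abstract. The iPad label is an assumption.
example = AnnotatedTweet(
    text="I love iPhone, but I hate iPad",
    tweet_label="mixed",
    entity_labels={"iPhone": "positive", "iPad": "negative"},
)


def entities_disagreeing_with_tweet(tweet: AnnotatedTweet) -> List[str]:
    """Return, in sorted order, the entities whose label differs from the tweet label."""
    return sorted(
        entity
        for entity, label in tweet.entity_labels.items()
        if label != tweet.tweet_label
    )


if __name__ == "__main__":
    print(entities_disagreeing_with_tweet(example))
```

A dataset that stores only `tweet_label` would assign "mixed" to both entities here, which is the limitation the abstract describes.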
Recommended from our members
Large-scale social-media analytics on stratosphere
The importance of social-media platforms and online communities, in business as well as in public contexts, is increasingly acknowledged and appreciated by industry and researchers alike. Consequently, a wide range of analytics has been proposed to understand, steer, and exploit the mechanics and laws driving their functionality and creating the resulting benefits. However, analysts usually face significant problems in scaling existing and novel approaches to match the data volume and size of modern online communities. In this work, we propose and demonstrate the usage of the massively parallel data processing system Stratosphere, based on second-order functions as an extended notion of the MapReduce paradigm, to provide a new level of scalability to such social-media analytics. Based on the popular example of role analysis, we present and illustrate how this massively parallel approach can be leveraged to scale out complex data-mining tasks, while providing a programming approach that eases the formulation of complete analytical workflows.
A Linked Open Data Approach for Sentiment Lexicon Adaptation
Social media platforms have recently become a gold mine for organisations to monitor their reputation by extracting and analysing the sentiment of the posts generated about them, their markets, and competitors. Among the approaches to analysing sentiment from social media, those based on sentiment lexicons (sets of words with associated sentiment scores) have gained popularity since they do not rely on training data, as opposed to Machine Learning approaches. However, sentiment lexicons assign a static sentiment score to each word without taking into consideration the different contexts in which the word is used (e.g., great problem vs. great smile). Additionally, new words that may not be covered by the lexicons constantly emerge from dynamic and rapidly changing social media environments. In this paper we propose a lexicon adaptation approach that makes use of semantic relations extracted from DBpedia to better understand the various contextual scenarios in which words are used. We evaluate our approach on three different Twitter datasets and show that using semantic information to adapt the lexicon improves sentiment computation by 3.7% in average accuracy, and by 2.6% in average F1 measure.
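The limitation of static lexicon scores can be seen in a toy sketch. The lexicon entries, their scores, and the override table below are invented for illustration; the override lookup is a simple stand-in for context-aware adaptation and is not the DBpedia-based method the abstract proposes.

```python
import re

# Hypothetical toy lexicon: word -> static sentiment score (values are invented).
LEXICON = {"great": 1.0, "problem": -1.0, "smile": 1.0}

# Hypothetical context overrides: (word, following word) -> adjusted score.
# This stands in for "adaptation"; it is NOT the paper's semantic approach.
CONTEXT_OVERRIDES = {("great", "problem"): -1.0}


def tokenize(text):
    """Lowercase the text and extract alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text, lexicon=LEXICON):
    """Sum static word scores; unknown words contribute 0."""
    return sum(lexicon.get(tok, 0.0) for tok in tokenize(text))


def adapted_score(text, lexicon=LEXICON, overrides=CONTEXT_OVERRIDES):
    """Sum word scores, replacing a word's score when a context override applies."""
    tokens = tokenize(text)
    total = 0.0
    for i, tok in enumerate(tokens):
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        total += overrides.get((tok, following), lexicon.get(tok, 0.0))
    return total


if __name__ == "__main__":
    for phrase in ("great problem", "great smile"):
        print(phrase, lexicon_score(phrase), adapted_score(phrase))
```

With the static lexicon, "great" adds the same positive score in both phrases, so "great problem" comes out neutral; the override makes the score of "great" depend on its neighbour.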
On stopwords, filtering and data sparsity for sentiment analysis of Twitter
Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier's feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining a high classification performance while reducing the data sparsity and substantially shrinking the feature space.
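A minimal sketch of the singleton-removal idea follows, assuming whitespace tokenisation and taking sparsity to be the fraction of zero cells in a binary document-term matrix. Both choices are assumptions made for illustration, as are the function names and the toy tweets; the abstract does not give the paper's exact definitions.

```python
from collections import Counter


def tokenize(text):
    """Assumed tokenisation: lowercase and split on whitespace."""
    return text.lower().split()


def remove_singletons(docs):
    """Drop every term that occurs exactly once in the whole corpus."""
    counts = Counter(tok for doc in docs for tok in doc)
    return [[tok for tok in doc if counts[tok] > 1] for doc in docs]


def vocabulary(docs):
    """Set of distinct terms across all documents (the feature space)."""
    return {tok for doc in docs for tok in doc}


def sparsity(docs):
    """Fraction of zero cells in the binary document-term matrix (assumed definition)."""
    vocab = vocabulary(docs)
    if not docs or not vocab:
        return 0.0
    nonzero = sum(len(set(doc)) for doc in docs)
    return 1.0 - nonzero / (len(docs) * len(vocab))


# Invented toy corpus, not data from the paper.
tweets = ["good phone good battery", "bad phone", "good screen"]
tokenized = [tokenize(t) for t in tweets]
filtered = remove_singletons(tokenized)

if __name__ == "__main__":
    print(len(vocabulary(tokenized)), sparsity(tokenized))
    print(len(vocabulary(filtered)), sparsity(filtered))
```

On this toy corpus the feature space shrinks from five terms to two and the matrix becomes denser, which mirrors the direction of the effect reported in the abstract without reproducing its measurements.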
The quest for information retrieval on the semantic web
Semantic search has been one of the motivations of the Semantic Web since it was envisioned. We propose a model for the exploitation of ontology-based KBs to improve search over large document repositories. The retrieval model is based on an adaptation of the classic vector-space model, including an annotation weighting algorithm, and a ranking algorithm. Semantic search is combined with keyword-based search to achieve tolerance to KB incompleteness. Our proposal has been tested on corpora of significant size, showing promising results with respect to keyword-based search, and providing ground for further analysis and research
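The combination of semantic and keyword-based ranking might look roughly like the sketch below. The cosine similarity is the standard vector-space measure, but the linear combination, the `alpha` weight, and the function names are assumptions; the abstract does not state how the two scores are merged or how annotations are weighted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors given as dicts (key -> weight)."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_score(semantic, keyword, alpha=0.5):
    """Assumed linear mix of the semantic and keyword similarity scores."""
    return alpha * semantic + (1.0 - alpha) * keyword


if __name__ == "__main__":
    # A document with no ontology annotations gets a semantic score of 0,
    # but can still be ranked through its keyword score.
    query_concepts = {"Concept:A": 1.0}
    doc_annotations = {}            # knowledge base has nothing for this document
    semantic = cosine(query_concepts, doc_annotations)
    print(combined_score(semantic, keyword=0.8))
```

Under this assumed scheme, a document missing from the knowledge base is not dropped from the ranking, which is one way to read the abstract's "tolerance to KB incompleteness".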
Estudio sobre el desarrollo moral y los dilemas morales en niños y niñas de 8 a 12 años [Study on moral development and moral dilemmas in boys and girls aged 8 to 12]
Moral development at early ages is a key factor in an individual's overall development, since it shapes later behaviour in adulthood. This work therefore studies moral development by presenting several moral dilemmas to boys and girls aged between 8 and 12, the point at which children acquire a clear and strong awareness of morality in general. To complete the study, the results obtained are compared with those expected according to Kohlberg's theory of moral development, on which this work is based. A comparative analysis of the results by sex and age of the subject is also carried out.
Intervenciones no farmacológicas en la osteoporosis [Non-pharmacological interventions in osteoporosis]
Introduction:
Osteoporosis is the most common metabolic bone disorder, characterised by reduced bone strength, which leads to an increased risk of fractures. It is also a major health problem worldwide because of its high social and economic cost and its high prevalence.
Patients with osteoporosis often present psychological disorders, difficulties in adhering to treatment, and physical problems related to the pain, disability or deformities that the disease can cause.
Objective:
To identify the most effective non-pharmacological interventions available for the control and prevention of osteoporosis.
Method: Systematic review of the literature
Discussion:
Prevention is the most effective treatment for combating osteoporosis. The role of the nurse is fundamental to improving these patients' quality of life by promoting healthy lifestyle habits.
Education programmes are considered an essential part of non-pharmacological treatment in patients with osteoporosis. Nursing care in this condition is fundamental to facilitating self-care and effective coping with the disease.
There are sufficient recommendations for carrying out non-pharmacological interventions in the treatment of osteoporosis, such as a balanced diet rich in calcium and vitamin D, exercise, and good health education provided by the nurse.
Degree in Nursing (Grado en Enfermería)
A sensitive spectrophotometric method for lead determination by flow injection analysis with on-line preconcentration
A new flow injection (FI) system for the determination of Pb(II) at trace level with a preconcentration step and spectrophotometric detection is proposed. It is based on preconcentration of lead ions on chitosan and dithizone-lead complex formation in aqueous medium (pH 9). The chemical and FIA variables influencing the performance of the system were optimized, and the system was applied to the determination of lead in natural, well, and drinking water samples. It is a simple, highly sensitive, and low-cost alternative methodology. The method provided a linear range between 25 and 250 μg l⁻¹, a detection limit of 5.0 ng ml⁻¹ and a sample throughput of 15 h⁻¹. The results obtained for spiked samples with the proposed method are in good agreement with those obtained by ICP-AES.
Affiliation: Di Nezio, Maria Susana. Universidad Nacional del Sur. Departamento de Química; Argentina
Affiliation: Palomeque, Miriam Edid. Universidad Nacional del Sur. Departamento de Química; Argentina
Affiliation: Fernández Band, Beatriz Susana. Universidad Nacional del Sur. Departamento de Química; Argentina
Using social media to inform policy making: to whom are we listening?
The dominance of social media is giving today's web users a venue for expressing their views and sharing their experiences with others. With well over a billion active users, social networking sites (SNS) have become dynamic sources of information on people's interests, needs and opinions and are considered an extremely rich source of content for reaching out to many millions of people. This is creating a revolutionary opportunity for governments to learn about citizens and to engage with them more effectively. The potential is there for eParticipation applications to go from simply informing the public to unprecedented levels of interaction and engagement between Policy Makers (PMs) and the community, involving the public in deliberation processes leading to legislation.
Despite its great potential, several concerns arise from the exploitation of social media, especially when used to inform policy making. Among these issues we can highlight the lack of awareness of the characteristics of those citizens discussing policy topics in social media, and the lack of awareness of the characteristics of their discussions. Although some studies have emerged in the last few years that aim to capture the demographics of social media users (e.g., gender, age, geographical location), they tend not to focus on those specific users participating in policy discussions. Understanding who the users discussing policy in social media are, and how policy topics are debated, could help assess how their views and opinions should be weighted and considered to inform policy making.
Aiming to provide a step forward in this direction, this paper investigates the characteristics of over 8K users involved in policy discussions on Twitter. These discussions were collected by monitoring, for one week, 42 different political topics selected by sixteen PMs from different political institutions in Germany. Our results indicate that: (i) a high volume of conversations around policy topics does not come from citizens, but from news agencies and other organisations, (ii) the average user discussing policy topics on Twitter is more active, popular and engaged than the average Twitter user, and (iii) users engaged in social media conversations around policy topics tend to be geographically concentrated in constituencies with high population density. Regarding the analysed conversations, a small subset of topics is extensively discussed while the majority go relatively unnoticed.